Data Preparation for Mining World Wide Web Browsing
Abstract
The World Wide Web (WWW) continues to grow at an astounding rate in both the sheer volume of traffic and the size and complexity of Web sites. The complexity of tasks such as Web site design, Web server design, and simply navigating through a Web site has increased along with this growth. An important input to these design tasks is the analysis of how a Web site is being used. Usage analysis includes straightforward statistics, such as page access frequency, as well as more sophisticated forms of analysis, such as finding the common traversal paths through a Web site. Web Usage Mining is the application of data mining techniques to usage logs of large Web data repositories in order to produce results that can be used in the design tasks mentioned above. However, several preprocessing tasks must be performed before data mining algorithms can be applied to the data collected from server logs. This paper presents several data preparation techniques for identifying unique users and user sessions. In addition, a method for dividing user sessions into semantically meaningful transactions is defined and successfully tested against two other methods. Transactions identified by the proposed methods are used to discover association rules from real-world data using the WEBMINER system [15].
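The abstract does not spell out how users and sessions are identified, so the following is only a minimal sketch of one common heuristic, not the paper's method. It assumes that a user can be approximated by the pair of IP address and user agent, and that a gap of more than 30 minutes between requests starts a new session. The record layout, the `sessionize` function, and the sample log entries are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical, simplified log records: (ip, user_agent, timestamp, url).
LOG = [
    ("10.0.0.1", "Mozilla/4.0", "1999-01-01 10:00:00", "/index.html"),
    ("10.0.0.1", "Mozilla/4.0", "1999-01-01 10:05:00", "/products.html"),
    ("10.0.0.1", "Lynx/2.8", "1999-01-01 10:06:00", "/index.html"),
    ("10.0.0.1", "Mozilla/4.0", "1999-01-01 11:30:00", "/contact.html"),
]


def sessionize(records, timeout=timedelta(minutes=30)):
    """Group requests into sessions per (ip, user_agent) pair.

    Assumption: a new session starts whenever the gap since the previous
    request from the same pair exceeds `timeout`.
    """
    by_user = defaultdict(list)
    for ip, agent, ts, url in records:
        when = datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        by_user[(ip, agent)].append((when, url))

    sessions = []
    for user, hits in by_user.items():
        hits.sort()  # order each user's requests by time
        current = [hits[0][1]]
        for (prev_ts, _), (ts, url) in zip(hits, hits[1:]):
            if ts - prev_ts > timeout:
                # Inactivity gap exceeded: close the session, start a new one.
                sessions.append((user, current))
                current = []
            current.append(url)
        sessions.append((user, current))
    return sessions


SESSIONS = sessionize(LOG)
```

With the sample data, the two user agents behind the same IP address are treated as separate users, and the long pause before the last request splits the first user's activity into two sessions. Sessions of this kind would then be the input to any later step that divides them into transactions.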
Similar resources
Designing a System for Trend Analysis of Users in Website Surfing in Iran Using Data Mining and Text Mining Algorithms
Background and Aim: Since web surfing has become part of the lifestyle of a vast majority of people, and more accurate social and cultural policy making is needed in this field, the authors set out to analyze how users view different websites so as to help politicians and practitioners. Methods: The design science research method is used in this research...
Web Mining: Information and Pattern Discovery on the World Wide Web
Application of data mining techniques to the World Wide Web, referred to as Web mining, has been the focus of several recent research projects and papers. However, there is no established vocabulary, leading to confusion when comparing research efforts. The term Web mining has been used in two distinct ways. The first, called Web content mining in this paper, is the process of information discovery from so...
Use of Ontology for Web Personalization Based on the Combination of Semantic Web Technologies and Usage Mining Techniques
In recent years, the rapid and chaotic growth in the size and use of the World Wide Web has continuously created new challenges and needs. Browsing can be improved and expedited by taking users' preferences into account, which results in personalization of web pages. In brief, web personalization can be defined as any action that customizes the information or services provided by a websi...
Web Usage Mining: Structuring semantically enriched clickstream data
Web servers worldwide generate a vast amount of information on web users' browsing activities. Several researchers have studied these so-called clickstream or web access log data to better understand and characterize web users. Clickstream data can be enriched with information about the content of visited pages and the origin (e.g., geographic, organizational) of the requests. The goal of this ...
Query-Driven Conceptual Browsing: A Semi-Automated Approach for Building and Exploring Concepts on the Web
The presence of communities, which are groups of highly cross referenced pages together representing a single concept, is a striking feature of the World Wide Web. Quite often a group of communities, each topically coherent within itself, may be related through a common concept manifested in each of them. Motivated by this observation, we present a method for query-driven conceptual browsing fo...